- ========================================================================
- Metrowerks PPC Release Notes
- ========================================================================
-
- Version: 2.2 ( __MWERKS__ == 0x2200 )
- Date: August 25, 1998
- Author: Bob Campbell
-
- ========================================================================
- New Features in this Version
- ========================================================================
-
- * The global optimizer is now controlled by the Global Optimizations
- panel and is set by levels (0-4). There are pragmas to disable
- some optimizations if needed. The following pragmas work for
- PowerPC, 68K and x86:
-
- #pragma opt_common_subs --> control common subexpression elimination
- #pragma opt_loop_invariants --> control loop invariant removal
- #pragma opt_strength_reduction --> control loop strength reduction
- #pragma opt_propagation --> control copy and constant propagation
- #pragma opt_lifetimes --> control lifetime analysis
- #pragma opt_deadcode --> control dead code elimination
- #pragma opt_dead_assignments --> remove dead assignments
-
- In addition there are 3 new PowerPC-specific pragmas which default
- to the following (more details on these below).
-
- #pragma ppc_unroll_instructions_limit 100
- #pragma ppc_unroll_factor_limit 10
- #pragma ppc_unroll_speculative off
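As a rough illustration of where these pragmas might sit in a source file, the sketch below places the three defaults quoted above ahead of a simple counted loop. The pragma spellings and values come from these notes; the guard on __MWERKS__, the function name sum_longs, and the assumption that the pragmas apply to the code that follows them are illustrative only.

```c
#include <assert.h>

/* Pragma names and values are the defaults quoted in these notes. The
   guard keeps other compilers from seeing them (an illustrative choice). */
#if defined(__MWERKS__) && (__MWERKS__ >= 0x2200)
#pragma ppc_unroll_instructions_limit 100
#pragma ppc_unroll_factor_limit 10
#pragma ppc_unroll_speculative off
#endif

/* Hypothetical routine: the trip count n is only known at run time, so
   this is the kind of loop speculative unrolling would look at. */
long sum_longs(const long *v, unsigned long n)
{
    long total = 0;
    unsigned long i;

    assert(v != 0 || n == 0);
    for (i = 0; i < n; i++)
        total += v[i];
    return total;
}
```

On any other compiler the guard hides the pragmas and the function behaves as ordinary C.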
-
-
- * Implement branchless compares which don't use the condition code
- fields. These optimizations are explained in "The PowerPC(tm)
- Compiler Writer's Guide" Appendix D (D.1 Comparisons and
- Comparisons Against Zero)
-
- a = b == c;
-
- * Implement "!" without branches if the expression is not a "&&" or
- "||". In effect "!" for a value in a register is the same as r == 0.
- The code generated is:
-
- cntlzw Ry,Rvalue
- srwi Rresult,Ry,5
-
- * Check for the "!!" case (which sometimes happens with inlining)
- !(!(x)) is the same as (x) iff x is a logical expression.
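The branchless forms described in the three items above can be modelled in portable C. This is only a sketch: count_leading_zeros32 is a hypothetical stand-in for the count-leading-zeros instruction, and the function names are not part of the compiler.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a 32-bit count-leading-zeros instruction: returns the
   number of leading zero bits, which is 32 when the input is 0. */
static uint32_t count_leading_zeros32(uint32_t x)
{
    uint32_t n = 0;
    while (n < 32 && (x & 0x80000000u) == 0) {
        x <<= 1;
        n++;
    }
    assert(n <= 32);
    return n;
}

/* !x : the count is 32 (binary 100000) only for 0, so shifting right
   by 5 leaves 1 for zero and 0 for everything else. */
uint32_t logical_not(uint32_t x)
{
    return count_leading_zeros32(x) >> 5;
}

/* a = b == c : the difference is zero exactly when b equals c. */
uint32_t is_equal(uint32_t b, uint32_t c)
{
    return logical_not(b - c);
}

/* !!x : equals x only when x is already a 0/1 logical value. */
uint32_t double_not(uint32_t x)
{
    return logical_not(logical_not(x));
}
```

Note that double_not(42) is 1, not 42, which is why the "!!" shortcut is limited to logical expressions.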
-
- * Optimize 16 or 8 bit math to remove the EXTS[HB] or RLWINM instructions
- when it can be proven that they are not needed (LHZ does not need to be
- followed by an RLWINM, and LHA does not need to be followed by an EXTSH,
- etc.)
-
- * Extensive Optimization Improvements
- + Loop unrolling for fixed count loops has been extended to
- handle loops where the body of the loop contains conditional
- code.
-
- + Controls for loop unrolling have been made externally settable
- using the pragmas "ppc_unroll_instructions_limit" and
- "ppc_unroll_factor_limit". The default values for these are:
-
- #pragma ppc_unroll_instructions_limit 100
- #pragma ppc_unroll_factor_limit 10
-
- The factor limit controls the max number of copies of the
- loop body which will be generated. The instruction limit
- controls the total size of the unrolled loop body. (It should
- be noted that even if the limit is 100 the result will normally
- be smaller as other optimizations generally will reduce the size
- of the loop body after it has been unrolled).
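To make the factor limit concrete, here is a hand-written sketch of a fixed count loop with conditional code in its body, next to the same loop unrolled by a factor of 4. The names COUNT, sum_positive and sum_positive_unrolled are hypothetical, and the factor the compiler would actually choose is not stated in these notes.

```c
#include <assert.h>

#define COUNT 8   /* fixed trip count, known at compile time */

/* Original form: a counted loop whose body contains conditional code. */
int sum_positive(const int v[COUNT])
{
    int i, total = 0;
    for (i = 0; i < COUNT; i++) {
        if (v[i] > 0)
            total += v[i];
    }
    return total;
}

/* Unrolled by 4: four copies of the loop body per pass, so the loop
   now iterates COUNT / 4 times. */
int sum_positive_unrolled(const int v[COUNT])
{
    int i, total = 0;
    assert(COUNT % 4 == 0);   /* no remainder loop needed in this sketch */
    for (i = 0; i < COUNT; i += 4) {
        if (v[i]     > 0) total += v[i];
        if (v[i + 1] > 0) total += v[i + 1];
        if (v[i + 2] > 0) total += v[i + 2];
        if (v[i + 3] > 0) total += v[i + 3];
    }
    return total;
}
```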
-
- + Speculative unrolling: when a counting loop is detected whose
- number of iterations is not a compile time constant but can
- be calculated at runtime, the loop is speculatively unrolled.
- For this to work the loop counter must be a 32 bit value (int,
- long, unsigned int, unsigned long) and the body of the loop must
- not contain any conditional code.
-
- This feature can be disabled by using the pragma
- "ppc_unroll_speculative"
-
- #pragma ppc_unroll_speculative off
-
- The unroll factor for speculative unrolling will be a power of
- 2 (so if unroll factor limit is 10, it will try 8, 4 and 2).
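The following sketch shows the general shape of an unrolled loop whose trip count is only known at run time, plus a helper that finds the first power of 2 to try for a given factor limit. Both function names are hypothetical, and handling the leftover iterations in a second plain loop is an assumption, since these notes do not say how the compiler deals with the remainder.

```c
#include <assert.h>

/* Largest power of two not exceeding the factor limit (8 for a limit
   of 10). Per the notes, smaller powers (4, 2) are tried after that. */
unsigned long largest_power_of_two_factor(unsigned long factor_limit)
{
    unsigned long f = 1;
    assert(factor_limit >= 1);
    while (f <= factor_limit / 2)
        f *= 2;
    return f;
}

/* Run-time trip count n, unsigned long counter, no conditional code in
   the body. Unrolled by 4 with a plain loop for the leftovers. */
long sum_unrolled_by_4(const long *v, unsigned long n)
{
    long total = 0;
    unsigned long i = 0;
    unsigned long main_end = n & ~3UL;   /* n rounded down to a multiple of 4 */

    for (; i < main_end; i += 4)
        total += v[i] + v[i + 1] + v[i + 2] + v[i + 3];
    for (; i < n; i++)                   /* 0 to 3 leftover iterations */
        total += v[i];
    return total;
}
```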
-
- + Loops containing a single conditional which is loop invariant
- are unswitched:
-
-     for (i = init; i < limit; i++) {
-         ... pre if statements ...
-         if (loop-invariant-expression) {
-             ... true statements ...
-         } else {
-             ... false statements ...
-         }
-         ... post if statements ...
-     }
-
- becomes:
-
-     if (loop-invariant-expression) {
-         for (i = init; i < limit; i++) {
-             ... pre if statements ...
-             ... true statements ...
-             ... post if statements ...
-         }
-     } else {
-         for (i = init; i < limit; i++) {
-             ... pre if statements ...
-             ... false statements ...
-             ... post if statements ...
-         }
-     }
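A compilable instance of the same transformation, written by hand, may help. The functions fill and fill_unswitched and the flag scale are hypothetical; the point is only that both versions store the same values.

```c
#include <assert.h>

/* Original: the test on 'scale' is repeated on every iteration even
   though 'scale' never changes inside the loop. */
void fill(int *out, int n, int scale)
{
    int i;
    for (i = 0; i < n; i++) {
        if (scale)
            out[i] = i * 2;
        else
            out[i] = i;
    }
}

/* Unswitched: the invariant test is made once and selects one of two
   loops, neither of which contains a conditional. */
void fill_unswitched(int *out, int n, int scale)
{
    int i;
    assert(out != 0 || n <= 0);
    if (scale) {
        for (i = 0; i < n; i++)
            out[i] = i * 2;
    } else {
        for (i = 0; i < n; i++)
            out[i] = i;
    }
}
```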
-
- + Treat as counting loops those loops where the condition is BNE
- provided the increment is 1 or -1.
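One way to see why the step must be 1 or -1: with a "not equal" exit test the counter is guaranteed to land exactly on the end value, so the trip count is a simple difference. The helper below is a hypothetical model of that calculation, not the compiler's own code.

```c
#include <assert.h>
#include <stdint.h>

/* Trip count of: for (i = start; i != end; i += step), where step is
   +1 or -1. Unsigned 32-bit subtraction wraps around, so the same
   formula also covers a counter that passes through the wrap point. */
uint32_t trip_count_ne(uint32_t start, uint32_t end, int step)
{
    assert(step == 1 || step == -1);
    return (uint32_t)((step == 1) ? (end - start) : (start - end));
}
```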
-
- + Improve handling of induction variables, detect nested induction
- variables which can be merged
-
- int a[10][10];
-
-     for (i = 0; i < 10; i++) {
-         for (j = 0; j < 10; j++) {
-             a[i][j]
-         }
-     }
-
- detect that a[i][j] is a single induction variable (instead of
- two (a[i], and a[i][j])). This only works if the loop is over
- all elements of the first subscript (otherwise the value of
- a[i] needs to be recomputed at the beginning of the inner loop).
-
- Also detect induction variables which are simple offsets from
- each other (like fields of a structure). If possible update
- the load/store instructions to use a single induction variable
- with the offsets encoded into the load/store instruction.
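At the source level, merging the two induction variables is roughly equivalent to walking the whole array with one pointer. The sketch below writes both forms by hand; sum_nested and sum_merged are hypothetical names, and the single-pointer walk mirrors what happens at the machine level rather than a recommended C idiom.

```c
#include <assert.h>

#define ROWS 10
#define COLS 10

/* Two induction variables: the row address a[i] and the element
   address a[i][j]. */
long sum_nested(int a[ROWS][COLS])
{
    long total = 0;
    int i, j;
    for (i = 0; i < ROWS; i++)
        for (j = 0; j < COLS; j++)
            total += a[i][j];
    return total;
}

/* One induction variable: a single pointer stepping through all
   ROWS * COLS elements in storage order. */
long sum_merged(int a[ROWS][COLS])
{
    long total = 0;
    const int *p = &a[0][0];
    const int *end = p + ROWS * COLS;

    assert(a != 0);
    while (p != end)
        total += *p++;
    return total;
}
```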
-
- + Extended Constant Propagation to handle ADD => ADDI, OR => ORI,
- SUB => ADDI or SUBFIC (depending on which parameter is constant).
- (These patterns were handled in the peephole, but Constant
- Propagation works across blocks).
-
- + ADDI ... Load/Store; whenever possible the constant part of an
- ADDI is propagated into the Load/Store instruction, and the ADDI
- is deferred as late in the basic block as possible. For example
-
- int *p;
-
- *p++ = 0; *p++ = 0; *p++ = 0; *p++ = 0;
-
- Produces something like:
-
- LI R0,0
- STW R0,0(Rp)
- ADDI Rp,Rp,4
- STW R0,0(Rp)
- ADDI Rp,Rp,4
- STW R0,0(Rp)
- ADDI Rp,Rp,4
- STW R0,0(Rp)
- ADDI Rp,Rp,4
-
- This is now optimized to:
-
- LI R0,0
- STW R0,0(Rp)
- STW R0,4(Rp)
- STW R0,8(Rp)
- STW R0,12(Rp)
- ADDI Rp,Rp,16
-
- When the value is not live, the last add is deleted. When the value
- is live and the ADDI can be folded into the last Load or Store, the
- update form of the load/store is used. This general optimization
- also works on other forms (so *--p, *++p etc. will be handled).
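In C terms the rewrite amounts to replacing four post-increment stores with four indexed stores and one pointer update covering all four increments. The two hypothetical functions below are hand-written to show that both forms leave memory and the pointer in the same state.

```c
#include <assert.h>

/* Source form: four post-increment stores. */
int *clear4_postinc(int *p)
{
    assert(p != 0);
    *p++ = 0; *p++ = 0; *p++ = 0; *p++ = 0;
    return p;
}

/* What the optimized code amounts to: constant offsets folded into
   each store, then one deferred update for all four increments. */
int *clear4_folded(int *p)
{
    assert(p != 0);
    p[0] = 0; p[1] = 0; p[2] = 0; p[3] = 0;
    return p + 4;   /* four ints, i.e. 16 bytes when int is 4 bytes */
}
```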
-
- + Peephole optimizer now keeps track of the usage of condition code
- fields so that compares whose result would be constant can be
- removed. In the following example the BEQ can become a "B" or be
- deleted; in addition the CMP (and perhaps the LI) can be deleted
- if their results are not live.
-
- LI Rx,k; CMP[L]I CRx,Rx,l; B[EQ|NE] CRx,label
- ==>
- if branch condition is true:
- B label
- if branch condition is false:
- B nextblock
-
- *Note: if the branch is to the next block, the final assembly pass
- will optimize the branch away.
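The decision the peephole pass makes here can be modelled as a small function of the two constants and the branch condition. This is a hypothetical model for illustration; the enum and function names are not taken from the compiler.

```c
#include <assert.h>

enum branch_cond { BR_EQ, BR_NE };
enum branch_fate { BRANCH_ALWAYS, BRANCH_NEVER };

/* Models LI Rx,k; CMPI CRx,Rx,l; B[EQ|NE] when k and l are both
   constants: the outcome is known, so the conditional branch turns
   into "B label" (always) or falls through to the next block (never). */
enum branch_fate fold_constant_branch(long k, long l, enum branch_cond cond)
{
    int equal = (k == l);

    assert(cond == BR_EQ || cond == BR_NE);
    if (cond == BR_EQ)
        return equal ? BRANCH_ALWAYS : BRANCH_NEVER;
    return equal ? BRANCH_NEVER : BRANCH_ALWAYS;
}
```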
-
- + Added peephole patterns:
- LBZX Ry,(Rx,Rw); RLWINM Rz,Ry,0,a,31; => LBZX Rz,(Rx,Rw)
- iff a <= 24, and Ry is not live after the RLWINM
-
- LHZX Ry,(Rx,Rw); RLWINM Rz,Ry,0,a,31; => LHZX Rz,(Rx,Rw)
- iff a <= 16, and Ry is not live after the RLWINM
-
- NOT Ry,Rx; AND Rz,Rw,Ry => ANDC Rz,Rw,Rx
- iff Ry is not live after the AND.
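The conditions on "a" in these patterns follow from the mask that RLWINM applies. The sketch below models that mask and the NOT/AND identity in portable C; all function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Mask for RLWINM with rotate 0 and bit range a..31. PowerPC numbers
   bits from 0 (most significant), so the low (32 - a) bits are set. */
uint32_t low_mask_from_bit(unsigned a)
{
    assert(a <= 31);
    return 0xFFFFFFFFu >> a;
}

/* A byte load zero-extends, so the upper 24 bits are already 0 and a
   mask with a <= 24 changes nothing. Returns 1 when the mask is a no-op. */
int byte_mask_is_redundant(uint8_t loaded, unsigned a)
{
    uint32_t r = loaded;
    return (r & low_mask_from_bit(a)) == r;
}

/* NOT followed by AND is the same as AND-with-complement. */
uint32_t andc(uint32_t w, uint32_t x)
{
    return w & ~x;
}
```

With a = 25 the mask no longer covers the top bit of the byte, which is why the pattern requires a <= 24 for byte loads (and a <= 16 for halfword loads).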
-
- + Extensive improvements for ADDI ... Load/Store patterns
-
- ========================================================================
- Bugs Fixed in This Version
- ========================================================================
-
- * MW09350: Unrolling loops which contain a conditional
- expression which has a "continue" (or no else and code
- after the conditional part of the if):
-
- for (i = 0; i < 4; i++) {
- if (cond || cond) {
-
- }
- }
-
- * MW09249: constant propagation incorrectly converted
-
- LI Rx,0; ... SUBF Rz,Ry,Rx ===>>> MR Rz,Rx
-
- it should have been
-
- NEG Rz,Ry
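For readers checking the arithmetic: the "subtract from" instruction computes its third operand minus its second, so when the third operand holds zero the result is a negation of the other operand, not a copy. The C model below is a sketch with hypothetical function names.

```c
#include <assert.h>
#include <stdint.h>

/* SUBF rD,rA,rB ("subtract from") computes rB - rA. */
uint32_t subf(uint32_t ra, uint32_t rb)
{
    return rb - ra;
}

/* NEG rD,rA computes 0 - rA. */
uint32_t neg(uint32_t ra)
{
    return 0u - ra;
}

/* With zero as the third operand, SUBF behaves exactly like NEG. */
void check_subf_from_zero(uint32_t ry)
{
    assert(subf(ry, 0u) == neg(ry));
}
```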
-
- * MW09181: counting loop with post decrement (or increment): when
- converting the loop (because it is known that the loop will always
- be executed once) the instructions following the compare were not
- copied into the "preheader".
-
- * MW09120 PowerPlant project won't link with PPC compiler 2.2 build 27
- The optimizer detected an invariant conditional in a loop and
- unswitched the loop, however the loop contained a call (which meant
- that there is little if any improvement with unswitching) and the loop
- was converted in such a way as to confuse the scheduler. The code now
- checks for calls (and instructions with side effects).
-
- * MW09048: Re : BUG - C++ PPC 2.2b1 + optimizations
- Fixed a bug in speculative loop unrolling when the loop counter
- was not 1 and the loop counter is not referenced in the loop body.
-
- * Fixed a bug handling the pattern
- ADD rX,rY,rZ; ADDI rW,rX,0 ==> ADD rW,rY,rZ
-
- * MW08931 Out-of-line traceback tables causes ICE
-
- * MW08861 fixes a bug which prevented disabling putting small static
- data in the TOC.
-
- * MW08762, MW08727 PPC backend's parser for pragmas was incorrectly
- complaining about missing EOL after unknown pragmas
-
- * MW08761: Incorrect internal error when trying to optimize small
- local arrays to registers.
-
- * MW08611 fixes a bug in structure copy using doubles (when one of the
- operands being copied is an array reference).
-
- * MW08481 fixes a bug in copy propagation which was confusing the offsets
- of parameters (only affects parameters not passed in registers)
-
- * Fixes a bug in global register allocation at optimization level 2
- (with only the global optimizer "Speed" checkbox set). This bug
- only affects C++ functions with an inlined function which has to
- destroy an object in response to an exception (and even then only
- for some rare cases).
-
- * Handle a conditional expression where one of the values is a throw:
-
- var = expr ? value : throw error;
-
- * Fix the parameter types on __memcpy() and __strcpy()
-
- * Handle using !variable as a parameter so it only generates
- two instructions (instead of 4).
-
- ========================================================================
- Contacting Metrowerks
- ========================================================================
-
- For bug reports, technical questions, and suggestions, please use the
- forms in the Release Notes folder on the CD, and send them to
-
- support@metrowerks.com
-
- See the CodeWarrior on the Nets document in the Release Notes folder for
- more contact information, including a list of Internet newsgroups,
- online services, and patch and update sites.
-
- ========================================================================
- Bob Campbell, Andy Nicolas
- CodeWarrior C/C++ PowerPC Engineering Team
- Metrowerks Corporation
-